mpirun(1)                                                        mpirun(1)

NAME
     mpirun - Runs MPI programs

SYNOPSIS
     mpirun [global_options] entry_object [:entry_object ...]

DESCRIPTION
     The mpirun command is the primary job launcher for the Message Passing
     Toolkit (MPT) implementations of MPI. The mpirun command must be used
     when a user wants to run an MPI application on IRIX or Linux systems.
     In addition, for IRIX systems to launch MPI programs, Array Services
     software must be running.

     MPI implements the MPI 1.2 standard, as documented by the MPI Forum in
     the spring 1997 release of MPI: A Message Passing Interface Standard.
     In addition, certain MPI-2 functions are implemented. However, several
     MPI implementations available today use a job launcher called mpirun,
     and because this command is not part of the MPI standard, each
     implementation's mpirun command differs in both syntax and
     functionality.

     You can run an application on the local host only (the host from which
     you issued mpirun) or distribute it to run on any number of hosts that
     you specify.

     The mpirun command accepts the following operands:

     The global_options operand applies to all MPI executable files on all
     specified hosts. Global options must be specified before local options
     specific to a host (entry_object). The following global options are
     supported:

     Global Option       Description

     -a[rray] array_name
          (This option is supported on IRIX systems only.) Specifies the
          array to use when launching an MPI application.
          By default, Array Services uses the default array specified in
          the Array Services configuration file,
          /usr/lib/array/arrayd.conf.

     -cpr
          (This option is supported on IRIX systems only.) Allows users to
          checkpoint or restart MPI jobs that consist of a single
          executable file running on a single system. Since MPI launches
          jobs through Array Services, you must also ensure that the array
          to which you are submitting contains only the local host. (If
          you do not specify an array, you must ensure that the default
          array contains only the local host.) The absence of any host
          names in the mpirun command indicates that a job is running on a
          single system.

          For example, the following command is valid in ksh (OUTFILE is
          the file to which stdout will be redirected, which may also be
          /dev/null):

               mpirun -v -cpr -np 2 a.out > OUTFILE 2>&1 < /dev/null

          The following commands are not valid:

               mpirun -cpr 2 ./a.out : 3 ./b.out
               mpirun -cpr hosta -np 2 ./a.out > out 2>&1 < /dev/null

          The first one is not valid because it consists of more than one
          executable file (a.out and b.out). The second one is not valid
          because even if submitted from hosta, it specifies a host name.

          For interactive users, the preferred method of checkpointing the
          job is by ASH. This ensures that all of the user's processes
          specified in the mpirun command, plus daemons associated with
          the job, will be checkpointed. You can use the array(1) command
          to find the ASH of a job. Interactive users should also note
          that stdin, stdout, and stderr should not be connected to the
          terminal when this option is being used.

          Use of this option requires Array Services 3.1 or later. By
          default, jobs can be checkpointed if the above rules for
          invoking mpirun have been followed, but using the -cpr option is
          recommended because it produces specific error messages instead
          of silently disabling checkpointing.

     -d[ir] path_name
          Specifies the working directory for all hosts. In addition to
          normal path names, the following special values are recognized:

          .    Translates into the absolute path name of the user's
               current working directory on the local host. This is the
               default.

          ~    Specifies the use of the value of $HOME as it is defined on
               each machine. In general, this value can be different on
               each machine.

     -f[ile] file_name
          Specifies a text file that contains mpirun arguments.

     -h[elp]
          Displays a list of options supported by the mpirun command.

     -miser
          (This option is supported on IRIX systems only.) Allows MPI jobs
          that run on a single system to be submitted to miser. The
          absence of any host names in the mpirun command indicates that a
          job is running on a single system, and thus can be submitted to
          miser.

          For example, the following command is valid:

               miser_submit -q queue -f file mpirun -miser 2 ./a.out : 3 ./b.out

          The following command is not valid, even if submitted on hosta:

               miser_submit -q queue -f file mpirun -miser hosta 2 ./a.out

          Use of this option requires Array Services 3.1 or later.

     -p[refix] prefix_string
          Specifies a string to prepend to each line of output from stderr
          and stdout for each MPI process.
          To delimit lines of text that come from different hosts, output
          to stdout must be terminated with a new line character. If a
          process's stdout or stderr streams do not end with a new line
          character, there will be no prefix associated with the output or
          error streams of that process from the final new line to the end
          of the stream.

          If the MPI_UNBUFFERED_STDIO environment variable is set, the
          prefix string is ignored.

          Some strings have special meaning and are translated as follows:

          * %g translates into the global rank of the process producing
            the output. This is equivalent to the rank of the process in
            MPI_COMM_WORLD when not running in spawn capable mode. In the
            latter case, this translates to the rank of the process within
            the universe specified at job launch.

          * %G translates into the number of processes in MPI_COMM_WORLD,
            or, if running in spawn capable mode, the value of the
            MPI_UNIVERSE_SIZE attribute.

          * %h translates into the rank of the host on which the process
            is running, relative to the mpirun command line. This string
            is not relevant for processes started via MPI_Comm_spawn or
            MPI_Comm_spawn_multiple.

          * %H translates into the total number of hosts in the job. This
            string is not relevant for processes started via
            MPI_Comm_spawn or MPI_Comm_spawn_multiple.

          * %l translates into the rank of the process relative to other
            processes running on the same host.

          * %L translates into the total number of processes running on
            the host.

          * %w translates into the world rank of the process, that is, its
            rank in MPI_COMM_WORLD. When not running in spawn capable
            mode, this is equivalent to %g.

          * %W translates into the total number of processes in
            MPI_COMM_WORLD. When not running in spawn capable mode, this
            is equivalent to %G.

          * %@ translates into the name of the host on which the process
            is running.

          For examples of the use of these strings, first consider the
          following program:

               #include <stdio.h>
               #include <mpi.h>

               int main(int argc, char **argv)
               {
                    MPI_Init(&argc, &argv);
                    printf("Hello world\n");
                    MPI_Finalize();
                    return 0;
               }

          Depending on how this code is run, the results of running the
          mpirun command will be similar to those in the following
          examples:

               % mpirun -np 2 a.out
               Hello world
               Hello world

               % mpirun -prefix ">" -np 2 a.out
               >Hello world
               >Hello world

               % mpirun -prefix "%g" 2 a.out
               0Hello world
               1Hello world

               % mpirun -prefix "[%g] " 2 a.out
               [0] Hello world
               [1] Hello world

               % mpirun -prefix "<process %g out of %G> " 4 a.out
               <process 1 out of 4> Hello world
               <process 0 out of 4> Hello world
               <process 3 out of 4> Hello world
               <process 2 out of 4> Hello world

               % mpirun -prefix "%@: " hosta,hostb 1 a.out
               hosta: Hello world
               hostb: Hello world

               % mpirun -prefix "%@ (%l out of %L) %g: " hosta 2, hostb 3 a.out
               hosta (0 out of 2) 0: Hello world
               hosta (1 out of 2) 1: Hello world
               hostb (0 out of 3) 2: Hello world
               hostb (1 out of 3) 3: Hello world
               hostb (2 out of 3) 4: Hello world

               % mpirun -prefix "%@ (%h out of %H): " hosta,hostb,hostc 2 a.out
               hosta (0 out of 3): Hello world
               hostb (1 out of 3): Hello world
               hostc (2 out of 3): Hello world
               hosta (0 out of 3): Hello world
               hostc (2 out of 3): Hello world
               hostb (1 out of 3): Hello world

     -stats
          Prints statistics about the amount of data sent with MPI calls
          during the MPI_Finalize process. Data is sent to stderr. Users
          can combine this option with the -p option to prefix the
          statistics messages with the MPI rank. For more details, see the
          MPI_SGI_stat_print(3) man page.

     -up u_size
          Specifies the value of the MPI_UNIVERSE_SIZE attribute to be
          used in supporting MPI_Comm_spawn and MPI_Comm_spawn_multiple.
          This option must be set if the application being launched by
          mpirun uses either of these functions. Setting it implies that
          the MPI job is being run in spawn capable mode.

     -v[erbose]
          Displays comments on what mpirun is doing when launching the MPI
          application.

   Entry Objects
     entry_object describes a host on which to run a program, and the local
     options for that host. You can list any number of entry_object entries
     on the mpirun command line.

     In the common case (Single Program Multiple Data (SPMD)), in which the
     same program runs with identical arguments on each host, usually only
     one entry_object needs to be specified.

     Each entry_object has the following components:

     * One or more host names (not needed if you run on the local host)

     * Number of processes to start on each host

     * Name of an executable program

     * Arguments to the executable program (optional)

     entry_object has the following format:

          host_list local_options program program_arguments

     The host_list operand is either a single host (machine name) or a
     comma-separated list of hosts on which to run an MPI program.

     The local_options operand contains information that applies to a
     specific host list. The following local options are supported:

     Local Option        Description

     -f[ile] file_name
          Specifies a text file that contains mpirun arguments (same as
          the global option). For more details, see the subsection titled
          "Using a File for mpirun Arguments" on this man page.

     -np num_proc
          Specifies the number of processes on which to run.

     -nt num_tasks
          This local option behaves the same as -np.

     The program program_arguments operand specifies the name of the
     program that you are running and its accompanying options.

   Using a File for mpirun Arguments
     Because the full specification of a complex job can be lengthy, you
     can enter mpirun arguments in a file and use the -f option to specify
     the file on the mpirun command line, as in the following example:

          mpirun -f my_arguments

     The arguments file is a text file that contains argument segments.
     White space is ignored in the arguments file, so you can include
     spaces and newline characters for readability. An arguments file can
     also contain additional -f options.

   Launching Programs on the Local Host
     For testing and debugging, it is often useful to run an MPI program on
     the local host only without distributing it to other systems. To run
     the application locally, enter mpirun with the -np or -nt argument.
     Your entry must include the number of processes to run and the name of
     the MPI executable file.

     The following command starts three instances of the application mtest,
     which is passed an arguments list (arguments are optional):

          mpirun -np 3 mtest 1000 "arg2"

     You are not required to use a different host in each entry that you
     specify on the mpirun command. You can launch a job that has two
     executable files on the same host. In the following example, both
     executable files use shared memory.

          mpirun host_a -np 6 a.out : host_a -np 4 b.out

     Note that for IRIX hosts, both executable files must be compiled as
     either 32-bit or 64-bit applications.

   Running Programs in Shared Memory Mode
     For running programs in MPI shared memory mode on a single host, the
     format of the mpirun command is as follows:

          mpirun -nt [num_tasks] progname

     The -nt option specifies the number of tasks for shared memory MPI. A
     single UNIX process is run with multiple tasks representing MPI
     processes. The progname operand specifies the name of the program that
     you are running and its accompanying options.

     The -nt option to mpirun is supported on IRIX and Linux systems for
     consistency across platforms.
     However, since the default mode of execution on a single system is to
     use shared memory, the option behaves the same as if you specified the
     -np option to mpirun. The following example runs ten instances of
     a.out in shared memory mode on the local host (host_a in this
     example):

          mpirun -nt 10 a.out

   Launching a Distributed Program
     You can use mpirun to launch a program that consists of any number of
     executable files and processes and distribute it to any number of
     hosts. A host is usually a single machine, or, for IRIX systems, can
     be any accessible computer running Array Services software. For
     available nodes on systems running Array Services software, see the
     /usr/lib/array/arrayd.conf file.

     You can list multiple entries on the mpirun command line. Each entry
     contains an MPI executable file and a combination of hosts and process
     counts for running it. This gives you the ability to start different
     executable files on the same or different hosts as part of the same
     MPI application.

     The following examples show various ways to launch an application that
     consists of multiple MPI executable files on multiple hosts.

     The following example runs ten instances of the a.out file on host_a:

          mpirun host_a -np 10 a.out

     When specifying multiple hosts, the -np or -nt option can be omitted
     with the number of processes listed directly. The following example
     launches ten instances of fred on three hosts. fred has two input
     arguments.

          mpirun host_a, host_b, host_c 10 fred arg1 arg2

     The following IRIX example launches an MPI application on different
     hosts with different numbers of processes and executable files, using
     an array called test:

          mpirun -array test host_a 6 a.out : host_b 26 b.out

     The following example launches an MPI application on different hosts
     out of the same directory on both hosts:

          mpirun -d /tmp/mydir host_a 6 a.out : host_b 26 b.out

   Job Control
     It is possible to terminate, suspend, and/or resume an entire MPI
     application (potentially running across multiple hosts) by using the
     same control characters that work for serial programs. For example,
     sending a SIGINT signal to mpirun terminates all processes in an MPI
     job. Similarly, sending a SIGTSTP signal to mpirun suspends an MPI job
     and sending a SIGCONT signal resumes a job.

   Signal Propagation
     It is possible to send some user signals to all processes in an MPI
     application (potentially running across multiple hosts). Presently,
     mpirun supports two user-defined signals: SIGURG and SIGUSR1. To make
     use of this feature, the MPI program needs to have a signal handler
     that catches SIGURG or SIGUSR1. When the SIGURG or SIGUSR1 signals are
     sent to the mpirun process ID, the mpirun process will catch the
     signal and propagate it to all MPI processes.
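
     The handler requirement described above can be sketched as follows.
     This is only a minimal sketch, assuming a POSIX system that provides
     sigaction(2); the helper names install_usr1_handler and usr1_pending
     are hypothetical and are not part of MPI or MPT. The handler does
     nothing except set a flag, which the application is assumed to check
     at a convenient point in its own code.

```c
#define _POSIX_C_SOURCE 200809L  /* request the POSIX sigaction() interface */
#include <signal.h>
#include <string.h>

/* Flag written by the handler. volatile sig_atomic_t is the object type
 * that may safely be written from inside a signal handler. */
static volatile sig_atomic_t got_usr1 = 0;

/* The handler itself: record that SIGUSR1 arrived and return. */
static void on_usr1(int signo)
{
    (void)signo;
    got_usr1 = 1;
}

/* Hypothetical helper: install the SIGUSR1 handler.
 * Returns 0 on success, -1 on failure (the sigaction() result). */
int install_usr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical helper: returns 1 if SIGUSR1 was seen since the last
 * call (and clears the flag), otherwise returns 0. */
int usr1_pending(void)
{
    if (got_usr1) {
        got_usr1 = 0;
        return 1;
    }
    return 0;
}
```

     Under these assumptions, an application would call
     install_usr1_handler() once during startup and poll usr1_pending()
     between computation steps. Sending SIGUSR1 to the mpirun process ID
     (for example, with kill -USR1) should then, as described above, reach
     every MPI process in the job.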

   Troubleshooting
     Problems you encounter when launching MPI jobs will typically result
     in a "could not run executable" error message from mpirun. There are
     many possible causes for this message, including (but not limited to)
     the following reasons:

     * The current directory (.) is missing from the user's search path.
       This problem most commonly occurs when the -np syntax is used.

     * No permission has been granted to the local host to launch
       processes on remote hosts. Because MPI references the .rhosts file
       for authentication, this can happen even if you are running your
       job on the same machine. For example, if you specify
       mpirun localhost 2 a.out, MPI will treat localhost as a remote
       host. The usual solution to this problem is to put the local host
       name in your ~/.rhosts file.

     * The working directory is defaulting to $HOME instead of to $PWD on
       remote machines; use either MPI_DIR or the -d option.

     * localhost does not appear in the /etc/hosts.equiv file (required
       for -np syntax).

     * For IRIX systems, the Array Services daemon (arrayd) has been
       incorrectly configured; use ascheck to test your configuration.

     * In general, if arshell fails, mpirun usually fails as well.

   Limitations
     The following practices will break the mpirun parser:

     * Using machine names that are numbers (for example, 3, 127, and so
       on)

     * Using MPI applications whose names match mpirun options (for
       example, -d, -f, and so on)

     * Using MPI applications that use a colon (:) in their command lines.

NOTES
     Running MPI jobs in the background is not supported on IRIX or Linux
     systems. The mpirun process is still connected to the tty when a job
     is placed in the background. One of the things that mpirun polls for
     is input from stdin. If it happens to be polling for stdin when a user
     types in a window after putting an MPI job in the background, the job
     will abort upon receiving a SIGTTIN signal. This behavior is
     intermittent, depending on whether mpirun happens to be looking for
     and sees any stdin input.

     Currently, there is no solution to this restriction, but for a job
     that does not use stdin, you can redirect stdin from /dev/null, as
     shown in the following example:

          mpirun -np 2 ./a.out < /dev/null &

RETURN VALUES
     On exit, mpirun returns the appropriate error code to the run
     environment.

SEE ALSO
     mpi(1)

     termio(7)